Applying multiagent reinforcement learning to distributed function optimization problems

Author

  • Chris Mattmann
Abstract

Consider a set of non-cooperative agents acting in an environment in which each agent attempts to maximize a private utility function. As each agent maximizes its private utility, we want a global “world” utility function to be maximized in turn. The inverse problem induced by this situation is the following: how does each agent choose its move so that, while it optimizes its private utility, the world utility is optimized as well? This problem has been considered by the theory of COllective INtelligence (COIN) [6]. This paper focuses on a method for improving a class of search algorithms using intelligent learning players, based on reinforcement learning (RL), that engage in a non-cooperative game. The players, or “coordinates”, use a more traditional AI approach to learning in a system in which the state of the world is largely unknown, and the effects of each agent’s actions on the world are not known a priori and must be discovered. This is essentially a search problem through the possible policies of each coordinate of the underlying system. Search algorithms that trade off exploration against exploitation, such as Simulated Annealing, Swarm Intelligence, and Genetic Algorithms, are particularly useful for solving systems of this nature, because they search directly in the space of RL policies and try to select optimal policies that pay off over time, simulating indirect learning through optimization of the system. However, these algorithms all share one major drawback: they do not rely on the previous experiences of players in the system to help determine which actions the players should take next. This drawback led Wolpert and Tumer to develop a new method for improving search algorithms, Intelligent Coordinates [7]. Intelligent Coordinates use RL to remember and “learn” how to better solve the system, by applying traditional RL techniques to the exploration stage of these search algorithms.
The algorithm improves the bin-packing simulation by an order of magnitude, and the improvement increases linearly with the number of items added. I present a more detailed explanation of the Intelligent Coordinates (IC) algorithm, and discuss two implementations of the bin-packing simulation that I have constructed in Java: one that uses Simulated Annealing (SA) to solve the problem, and one that uses Intelligent Coordinates. Further, I contribute an efficient method for processing time-weighted sums, which are needed in the simulation. I also formalize a method for calculating the time-weighted probabilities of picking a particular bin B in the simulation. These probabilities are used to “mask” the original SA distribution to achieve the more relevant IC probability distribution. I also present some original experiments in the bin-packing simulation that were not present in the original paper [7].

Author affiliation: University of Southern California, Los Angeles
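
The time-weighted sums and the probability “mask” mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual formulas, so everything here is an assumption made for illustration: the exponential decay factor `gamma`, the Boltzmann-style reweighting, the `temperature` parameter, and the names `TimeWeightedAverage` and `masked_distribution` are all hypothetical and are not taken from the paper or its Java implementation.

```python
import math


class TimeWeightedAverage:
    """Exponentially time-weighted average of observed rewards.

    Two running sums are kept so each update costs O(1) instead of
    re-scanning the whole history. The decay factor `gamma` is an
    illustrative assumption, not a value from the paper.
    """

    def __init__(self, gamma=0.9):
        self.gamma = gamma
        self.weighted_sum = 0.0  # sum of gamma**age * reward
        self.weight_total = 0.0  # sum of gamma**age

    def tick(self):
        # Age every stored sample by one time step.
        self.weighted_sum *= self.gamma
        self.weight_total *= self.gamma

    def add(self, reward):
        # A new sample enters with age 0, so its weight is 1.
        self.weighted_sum += reward
        self.weight_total += 1.0

    def value(self, default=0.0):
        # Return `default` when no sample has been recorded yet.
        if self.weight_total == 0.0:
            return default
        return self.weighted_sum / self.weight_total


def masked_distribution(base_probs, estimates, temperature=1.0):
    """Reweight a base proposal distribution over bins.

    Each base probability is multiplied by a Boltzmann factor of that
    bin's estimated reward, then the result is renormalized. This form
    of the mask is an assumption, not the paper's formula.
    """
    shift = max(estimates)  # subtract the max for numerical stability
    weights = [
        p * math.exp((e - shift) / temperature)
        for p, e in zip(base_probs, estimates)
    ]
    total = sum(weights)
    if total == 0.0:
        return list(base_probs)
    return [w / total for w in weights]


if __name__ == "__main__":
    # One hypothetical estimator per bin, fed with made-up rewards.
    bins = [TimeWeightedAverage(gamma=0.5) for _ in range(3)]
    bins[0].add(1.0)
    bins[1].add(0.2)
    for b in bins:
        b.tick()
    bins[0].add(3.0)
    estimates = [b.value() for b in bins]
    uniform = [1.0 / 3.0] * 3  # stand-in for the SA move distribution
    print(masked_distribution(uniform, estimates))
```

In this sketch, bins whose recent rewards were higher receive more probability mass, while a bin with no history keeps a share proportional to its base probability and the default estimate.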

Similar resources

A Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem

Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...

Sparse Tabular Multiagent Q-learning

Multiagent learning problems can in principle be solved by treating the joint actions of the agents as single actions and applying single-agent Q-learning. However, the number of joint actions is exponential in the number of agents, rendering this approach infeasible for most problems. In this paper we investigate a sparse representation of the Q-function by only considering the joint actions in...

Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are a number of challenges in WSNs because of limitations in battery power, communications, computation and storage space. In recent years, computational intelligence approaches such as evolutionar...

A New Approach for the Solution of Multiple Objective Optimization Problems Based on Reinforcement

Many problems can be characterized by several competing objectives. Multiple objective optimization problems have recently received considerable attention, especially from the evolutionary algorithms community. Their proposals, however, require an adequate codification of the problem into strings, which is not always easy to do. This paper introduces a new algorithm, called MDQL, for multiple object...

Resource Abstraction for Reinforcement Learning in Multiagent Congestion Problems

Real-world congestion problems (e.g. traffic congestion) are typically very complex and large-scale. Multiagent reinforcement learning (MARL) is a promising candidate for dealing with this emerging complexity by providing an autonomous and distributed solution to these problems. However, there are three limiting factors that affect the deployability of MARL approaches to congestion problems. Th...

Distributed multiagent learning with a broadcast adaptive subgradient method

Many applications in multiagent learning are essentially convex optimization problems in which agents have only limited communication and partial information about the function being minimized (examples of such applications include, among others, coordinated source localization, distributed adaptive filtering, control, and coordination). Given this observation, we propose a new non-hierarchical...

Publication date: 2003